
    Secure data sharing and processing in heterogeneous clouds

    The extensive cloud adoption among European public-sector players has empowered them to own and operate a range of cloud infrastructures. These deployments vary in size and capability as well as in the range of employed technologies and processes. The public sector, however, lacks the technology needed to enable effective, interoperable, and secure integration of its many computing clouds and services. In this work we focus on the federation of private clouds and the approaches that enable secure data sharing and processing among the collaborating infrastructures and services of public entities. We investigate access control, data and security policy languages, and cryptographic approaches that enable fine-grained security and data processing in semi-trusted environments. We identify the main challenges and frame future work that serves as an enabler of interoperability among heterogeneous infrastructures and services. Our goal is to enable both security and legal conformance, and to facilitate the transparency, privacy, and effectiveness of private cloud federations for public-sector needs. © 2015 The Authors
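    Of the access-control approaches mentioned above, attribute-based access control (ABAC) is one common way to express fine-grained policies across federated infrastructures. The following is a minimal, hypothetical sketch of the idea; the attribute names and the `evaluate` helper are illustrative, not taken from the paper.

    ```python
    # Hypothetical ABAC sketch: a policy is a set of required attribute values,
    # and a request is granted only if every required attribute matches.

    def evaluate(policy, request):
        """Return True if the request satisfies every attribute in the policy."""
        return all(request.get(key) == value for key, value in policy.items())

    policy = {"role": "analyst", "ministry": "finance", "clearance": "restricted"}

    print(evaluate(policy, {"role": "analyst", "ministry": "finance",
                            "clearance": "restricted"}))   # True
    print(evaluate(policy, {"role": "clerk", "ministry": "finance",
                            "clearance": "restricted"}))   # False
    ```

    Real deployments would express such policies in a standard language (e.g. XACML) rather than inline dictionaries, but the evaluation principle is the same.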

    Students and Taxes: a Privacy-Preserving Social Study Using Secure Computation

    We describe the use of secure multi-party computation for performing a large-scale privacy-preserving statistical study on real government data. In 2015, statisticians from the Estonian Center of Applied Research (CentAR) conducted a big data study looking for correlations between working during university studies and failing to graduate on time. The study was conducted by linking the database of individual tax payments from the Estonian Tax and Customs Board with the database of higher education events from the Ministry of Education and Research. Data collection, preparation, and analysis were conducted using the Sharemind secure multi-party computation system, which provided end-to-end cryptographic protection to the analysis. With ten million tax records and half a million education records, this is the largest cryptographically private statistical study ever conducted on real data.
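    Sharemind-style secure multi-party computation is typically built on additive secret sharing: each input value is split into random shares held by different parties, so no single party learns the value, yet aggregates can be computed share-wise. The sketch below illustrates the idea for a private sum. It is a deliberate simplification (the ring size, party count, and variable names are assumptions; Sharemind's actual protocols are considerably more involved).

    ```python
    import secrets

    MOD = 2**32  # assumed ring size; sharing is done modulo MOD

    def share(x, n=3):
        """Split x into n additive shares that sum to x mod MOD."""
        parts = [secrets.randbelow(MOD) for _ in range(n - 1)]
        parts.append((x - sum(parts)) % MOD)
        return parts

    def reconstruct(shares):
        """Recombine shares into the original value."""
        return sum(shares) % MOD

    # Each party holds one share per record; each sums its own column of
    # shares, and only the final aggregate is reconstructed.
    tax_records = [1200, 850, 2300]              # illustrative values
    shared = [share(v) for v in tax_records]
    per_party_sums = [sum(col) % MOD for col in zip(*shared)]
    print(reconstruct(per_party_sums))           # 4350
    ```

    The key property is that any single party's share (or partial sum) is uniformly random and reveals nothing about the underlying records.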

    DataBio Deliverable D4.4 – Service Documentation

    The public deliverable D4.4 describes the software components and processes (here called pipelines, as the processes mostly consist of Big Data volumes streaming through successive processing steps) to be utilized by the DataBio Platform and pilots. The pilot services were tested in two phases, Trial 1 and Trial 2 of the project. Most of the components were used in both trials, with some features updated for Trial 2. In addition, this deliverable reports which components were deployed in each pilot and the development platform on which the pilots tested their Big Data solutions. The document aggregates information dispersed among various deliverables (namely [REF-01] - [REF-06]); its aim is to create a comprehensive overview of DataBio's technical results.
    The objective of WP4 “DataBio Platform with Pilot Support” was to configure and adopt Big Data technologies for agriculture, forestry, and fishery. Together with WP5 “Earth Observation and Geospatial Data and Services”, the work package established a platform for the development of bioeconomy applications; the software and dataset repository DataBio Hub is a central resource of that platform. In doing so, WP4 supported the DataBio pilots in their needs for Big Data technologies.
    This deliverable starts with an overview of DataBio building blocks (platform architecture, software components, datasets, and models) that offer functionalities primarily for services in the domains of agriculture, forestry, and fishery. It then identifies cross-reusable (sub)pipelines (“design patterns”) that can be used across the pilots of the project and applied to other domains; the pipelines are one of the major exploitable assets of DataBio. The generic sections of the deliverable conclude with Chapter 4, which explains the integration of different components into a pipeline and the services provided per pilot. The main results for the pilot services and the component updates, from a technological perspective, are presented for both Trials 1 and 2. The concluding chapter outlines the main findings, lessons learned, and emerging examples of best practice. The deliverable comprises contributions from the following tasks:
    • T4.1: DataBio Architecture Requirements
    • T4.2: Advanced Visualization Services
    • T4.3: Predictive Analytics and Machine Learning
    • T4.4: Real-time Analytics and Stream Processing
    • T4.5: Big Data Variety Management, Storage, Linked Data and Queries
    • T4.6: Big Data Acquisition and Curation with Security/Privacy Support
    • T5.1: EO Subsystem and Components
    • T5.2: EO Data Discovery and Data Management & Acquisition Services
    • T5.3: EO Data Processing, Extraction, Conversion and Fusion Services
    • T5.5: Meteo Data Management
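    The "pipeline" notion described above, data streaming through successive processing steps, maps naturally onto chained generators. The sketch below is a minimal, hypothetical illustration of that pattern; the step names are not actual DataBio components.

    ```python
    # Hypothetical streaming pipeline: each stage is a generator that
    # consumes the previous stage's output lazily, so records flow
    # through one at a time instead of being materialized in bulk.

    def ingest(records):
        """Read raw records and normalize whitespace."""
        for record in records:
            yield record.strip()

    def filter_valid(stream):
        """Drop empty records."""
        for record in stream:
            if record:
                yield record

    def transform(stream):
        """Apply a per-record transformation (here, uppercasing)."""
        for record in stream:
            yield record.upper()

    pipeline = transform(filter_valid(ingest([" wheat ", "", "barley"])))
    print(list(pipeline))  # ['WHEAT', 'BARLEY']
    ```

    Because each stage only depends on the iterator protocol, stages can be reused and recombined across pipelines, which is the "design pattern" reuse the deliverable describes.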
